Voice Activity Detection using Group Delay Processing on Buffered Short-term Energy
Authors
Abstract
In this paper, we present an algorithm for Voice Activity Detection (VAD) in speech signals using the minimum phase group delay function. The proposed method takes a buffer of contiguous frames of the given signal and computes the short-term energy (STE) over that buffer. By appending a surrogate signal to the STE and viewing the resultant signal as a positive part of the magnitude spectrum of an arbitrary signal, the minimum phase group delay function is computed. The group delay is then noise compensated and median filtered. Regions with positive group delay values are classified as speech and those with negative values as noise. Experimental comparisons with the G.729 Annex B VAD algorithm demonstrate significantly better performance for the proposed method, indicating that the algorithm is robust to noise.
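The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the surrogate signal, so a constant pad at the geometric mean of the buffer's minimum and maximum STE is used as a placeholder; the noise compensation step is omitted because its details are not given; and the function names (`short_term_energy`, `min_phase_group_delay`, `median_filter`, `vad_buffer`), the frame length, and the filter width are all hypothetical choices. The group delay is obtained from the real cepstrum of the symmetrized STE, using the standard minimum-phase relation tau(w) = sum over n of n * c[n] * cos(n * w).

```python
import numpy as np

def short_term_energy(x, frame_len):
    """Energy of consecutive non-overlapping frames of x."""
    n_frames = len(x) // frame_len
    frames = x[:n_frames * frame_len].reshape(n_frames, frame_len)
    return np.sum(frames ** 2, axis=1)

def min_phase_group_delay(half_mag):
    """Group delay of the minimum-phase signal whose half magnitude
    spectrum (DC to Nyquist) is half_mag; all values must be > 0."""
    # Mirror the half spectrum to get an even-symmetric full spectrum.
    full = np.concatenate([half_mag, half_mag[-2:0:-1]])
    n_fft = len(full)
    # Real cepstrum of the log magnitude.
    cep = np.fft.ifft(np.log(full)).real
    # Fold the cepstrum onto its causal part (minimum-phase cepstrum).
    fold = np.zeros(n_fft)
    fold[0] = 1.0
    fold[1:n_fft // 2] = 2.0
    fold[n_fft // 2] = 1.0
    c_min = cep * fold
    # tau[k] = sum_n n * c_min[n] * cos(2*pi*k*n / n_fft)
    tau = np.fft.fft(np.arange(n_fft) * c_min).real
    return tau[:len(half_mag)]

def median_filter(v, width=5):
    """Sliding median with edge padding (width should be odd)."""
    pad = width // 2
    padded = np.pad(v, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, width)
    return np.median(windows, axis=1)

def vad_buffer(x, frame_len=160, n_surrogate=8, eps=1e-10):
    """Return one boolean per frame: True = speech, False = noise."""
    ste = short_term_energy(x, frame_len) + eps  # eps avoids log(0)
    # ASSUMPTION: placeholder surrogate, not specified in the abstract.
    surrogate = np.full(n_surrogate, np.sqrt(ste.min() * ste.max()))
    gd = min_phase_group_delay(np.concatenate([ste, surrogate]))
    # Noise compensation is omitted here; only median filtering is applied.
    gd = median_filter(gd[:len(ste)])
    return gd > 0
```

Calling `vad_buffer(x)` on a buffer `x` of samples (a one-dimensional NumPy array holding at least two frames) returns one decision per 160-sample frame. Because a group delay computed this way sums to zero over the full circular spectrum, a buffer always yields both positive and negative regions, which is presumably one reason the original method appends a surrogate signal before classifying by sign.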
Related articles
A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)
Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech from noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...
Voice-based Age and Gender Recognition using Training Generative Sparse Model
Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...
Grid Impedance Estimation Using Several Short-Term Low Power Signal Injections
In this paper, a signal processing method is proposed to estimate the low and high-frequency impedances of power systems using several short-term low power signal injections for a frequency range of 0-150 kHz. This frequency range is very important, and thus it is considered in the analysis of power quality issues of smart grids. The impedance estimation is used in many power system applicati...
Approach for Energy-Based Voice Detector with Adaptive Scaling Factor
This paper presents an alternative energy-based algorithm to provide speech/silence classification. The algorithm is capable of tracking non-stationary signals and dynamically calculating an instantaneous threshold value using an adaptive scaling parameter. It is based on the observation that a noise power estimate used for computing the threshold can be obtained using minimum and maximum values ...
Voiced/Unvoiced Detection using Short Term Processing
A new method for identifying voiced and unvoiced speech regions is proposed. Voiced/unvoiced speech detection is needed to extract information from the speech signal and is important in the area of speech analysis. In this paper, voiced and unvoiced speech regions are identified using Short Term Processing (STP). Short Term Processing of speech has been performed by viewing the speech sign...